Guppy colour learning project site — Analysis 2


Overview

This page reports the analyses for the second experiment described in ‘Colour biases in learned foraging preferences in Trinidadian guppies.’ The code run to produce the results is included on the page, along with explanations of what the code is doing and why. The raw R script to reproduce the data preparation, analysis, figures, and this page is in analysis-experiment-2.Rmd. Note that the code blocks that produce the figures and tables are not shown on this page because they are rather long; they can also be seen in analysis-experiment-2.Rmd. To get straight to the results, go to the Models section. To see how to reproduce these results, please visit the How to Reproduce the Results section of the README.


Data preparation

In this section we detail the steps taken to process the raw data produced by processing video footage with automated tracking from Noldus EthoVision (Noldus et al., 2001). The raw data can be found in the data/experiment-2-raw-data/ directory. They are composed of .xlsx files exported from EthoVision XT Version 11. Each trial is in a separate .xlsx file. The full processed data are available as a .csv file in the file colour-learning-experiment-2-full-data.csv. Descriptions of the variables found in the data set are given in the variable descriptions section of the README file.

Downloading data

To prepare the data, we first download the raw data files from the Google Drive folder in which they are stored. We use the tidyverse package googledrive to do this. We put googledrive into a de-authorized state so we can access public Google Drive resources without a Google sign-in. We then get the list of files present in the Google Drive directory and use a for() loop to download each file with the drive_download() function. The data are downloaded to the data/experiment-2-raw-data/ directory.

# Downloading data from Google drive

## Put googledrive into a de-authorized state
drive_deauth()

## Store link to data folder
data_folder_link <- "https://drive.google.com/drive/folders/1A8NRlBMQ-BfkgNHzEpmw6hEbgePJncLj?usp=sharing"

## Get id for data folder
data_folder_id <- drive_get(as_id(data_folder_link))

## Store the list of file names and ids found in the data folder
data_files <- drive_ls(data_folder_id)

## Loop through and download each file
for (file_x in seq_along(data_files$name)) {
  drive_download(
    as_id(data_files$id[file_x]),
    path = str_c("data/experiment-2-raw-data/", data_files$name[file_x]),
    overwrite = TRUE
  )
}

Formatting data

Next we read in and format the raw .xlsx files from EthoVision in data/experiment-2-raw-data/ using a custom function, read_and_format_ethovision_data(). The code for this function can be seen in read-and-format-ethovision-data.R.

# Reading in Data
full_data <- read_and_format_ethovision_data("data/experiment-2-raw-data/")

Next we add the rewarding object colour treatments to the correct guppy IDs. Treatments were established a priori in datasheet-experiment-2.Rmd and are represented by the variable rewarding.object.colour.

## Assigning treatments
full_data <- full_data %>%
  mutate(
    rewarding.object.colour =
      case_when(
        id == "1a" ~ "blue",
        id == "1b" ~ "green",
        id == "2a" ~ "blue",
        id == "2b" ~ "blue",
        id == "3a" ~ "blue",
        id == "3b" ~ "green",
        id == "4a" ~ "green",
        id == "4b" ~ "green",
        id == "5a" ~ "green",
        id == "5b" ~ "blue",
        id == "6a" ~ "green",
        id == "6b" ~ "green",
        id == "7a" ~ "blue",
        id == "7b" ~ "blue",
        id == "8a" ~ "green",
        id == "8b" ~ "blue"
      )
  )

All the variables in the data set are read in as characters because of the read_excel() call in read_and_format_ethovision_data(), so we need to convert them to the appropriate types for the analysis. Variables are converted to factors or numerics using the lapply() function, which applies a function over each element of a list or vector. We apply as.factor() to the categorical variables listed in the Factors vector and as.numeric() to the numerical variables listed in the Numerics vector.

For the latency measures, dashes in the raw data sheet indicate that an individual never visited the zone of interest. When converted to numerics these values become NAs. We replace these NAs with the maximum possible value, the trial duration (300 seconds), using the tidyr function replace_na().
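A quick base-R illustration of this coercion (not part of the analysis script): a dash coerces to NA, and replace_na() also works directly on a vector.

```r
# Illustration only: "-" coerces to NA, which is then replaced with the
# 300 s trial duration. suppressWarnings() silences the coercion warning.
x <- suppressWarnings(as.numeric(c("12.5", "-")))
x                          # 12.5   NA
tidyr::replace_na(x, 300)  # 12.5  300
```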

# Converting variables

## Factors
Factors <- c("ate", "id", "object.side", "rewarding.object.colour", "object.pair")
full_data[Factors] <- lapply(full_data[Factors], as.factor)

## Numeric
Numerics <- c(
  "trial", "left.object.visits", "time.with.left.object",
  "left.object.latency", "right.object.visits", "time.with.right.object",
  "right.object.latency", "periphery.visits", "time.in.periphery",
  "latency.to.periphery", "center.visits", "time.in.center",
  "latency.to.center", "distance.moved", "mean.velocity"
)
full_data[Numerics] <- lapply(full_data[Numerics], as.numeric)

## Latency NA replacement
full_data <- full_data %>%
  replace_na(
    list(
      left.object.latency = 300,
      right.object.latency = 300,
      latency.to.periphery = 300,
      latency.to.center = 300
    )
  )

Variable creation

New variables and measures need to be created from the variables present in the raw data sheets. We do this using the mutate() and case_when() functions from the tidyverse package dplyr. First we invert the object side, because the camera image is mirrored relative to the experimenter's perspective. We then create the variables time.with.trained.object and time.with.untrained.object by identifying whether the left or the right object is the rewarding object.

The preference metrics green.object.preference and rewarding.object.preference are created by subtracting the time spent near the blue object from the time spent near the green object, and the time spent near the untrained object from the time spent near the trained object, respectively.

time.with.both.objects is obtained by summing the time spent near the left and the right object. total.time is obtained by summing the time.in.periphery with the time.in.center. total.time should be close to 300 since trials last 5 minutes (300 seconds).

We also create the variable trial.type to identify whether each trial is a test trial (unreinforced), a training trial (reinforced), or a refresher trial. Finally, we assign each fish its body weight in grams, measured after trial 21 (see Miscellaneous descriptive statistics).

# Creating new variables

## Inverting object side
full_data <- full_data %>%
  mutate(
    reward.object.side =
      as.factor(
        case_when(
          object.side == "left" ~ "right",
          object.side == "right" ~ "left"
        )
      )
  )

## Time with trained object
full_data <- full_data %>%
  mutate(
    time.with.trained.object =
      case_when(
        reward.object.side == "left" ~ time.with.left.object,
        reward.object.side == "right" ~ time.with.right.object
      )
  )

## Time with untrained object
full_data <- full_data %>%
  mutate(
    time.with.untrained.object =
      case_when(
        reward.object.side == "left" ~ time.with.right.object,
        reward.object.side == "right" ~ time.with.left.object
      )
  )

## Green object preference
full_data <- full_data %>%
  mutate(
    green.object.preference =
      case_when(
        rewarding.object.colour == "green" ~
        time.with.trained.object - time.with.untrained.object,
        
        rewarding.object.colour == "blue" ~
        time.with.untrained.object - time.with.trained.object
      )
  )

## Rewarding object preference
full_data <- full_data %>%
  mutate(
    rewarding.object.preference =
      time.with.trained.object - time.with.untrained.object
  )

## Proportional rewarding object preference
full_data <- full_data %>%
  mutate(
    prop.rewarding.object.preference =
      time.with.trained.object / (time.with.trained.object + time.with.untrained.object)
  )

## Time with both objects
full_data <- full_data %>%
  mutate(
    time.with.both.objects =
      time.with.left.object + time.with.right.object
  )

## Total time
full_data <- full_data %>%
  mutate(
    total.time =
      time.in.center + time.in.periphery
  )

## Trial type
full_data <- full_data %>%
  mutate(
    trial.type =
      as.factor(
        case_when(
          trial %in% c(0, 21, 23, 25, 27, 29) ~ "test",
          trial > 0 & trial < 21 ~ "training",
          trial %in% c(22, 24, 26, 28) ~ "refresher"
        )
      )
  )

## Assigning weights
full_data <- full_data %>%
  mutate(
    weight =
      case_when(
        id == "1a" ~ 0.29,
        id == "1b" ~ 0.10,
        id == "2a" ~ 0.20,
        id == "2b" ~ 0.11,
        id == "3a" ~ 0.20,
        id == "3b" ~ 0.12,
        id == "4a" ~ 0.21,
        id == "4b" ~ 0.11,
        id == "5a" ~ 0.18,
        id == "5b" ~ 0.13,
        id == "6a" ~ 0.15,
        id == "6b" ~ 0.11,
        id == "7a" ~ 0.31,
        id == "7b" ~ 0.13,
        id == "8a" ~ 0.17,
        id == "8b" ~ 0.14
      )
  )

We next create subsets of the full data set restricted to the training trials (reinforced), the test trials (unreinforced), and the initial baseline test trial (unreinforced) using the filter() function from dplyr. We change trial to a factor for the test-trial subset, since the analysis on this subset compares two levels of trial to each other, and to an integer for the training-trial subset, where trial enters the model as a continuous predictor.

# Restrict data to only the baseline data
baseline_data <- full_data %>%
  filter(trial == 0)

# Restrict data to training data
training_data <- full_data %>%
  filter(trial.type == "training")

# Restrict data to only the baseline and re-test data
test_data <- full_data %>%
  filter(trial.type == "test")

# Change trial to factor for test trials
test_data$trial <- as.factor(test_data$trial)

# Change trial to integer for training trials
training_data$trial <- as.integer(training_data$trial)

Exporting processed data

Finally we export the full data set as a .csv file to future-proof the full data sheet in a plain-text, machine-readable format. row.names is set to FALSE so that the row-index column is not written into the .csv file.

write.csv(full_data, 
          file = "data/colour-learning-experiment-2-full-data.csv",
          row.names = FALSE)

Models

We analysed the data from our experiment using linear mixed effect and generalized linear mixed effect models with the lmer() and glmer() functions from the lme4 package. P-values and effective degrees of freedom were obtained with the lmerTest package, which uses Satterthwaite’s degrees-of-freedom method (Kuznetsova et al., 2017). Model residuals were checked against distributional assumptions with the DHARMa package. The ‘See Model Residuals’ button below each model formula can be clicked to see the residual diagnostic plots produced by DHARMa for that model.
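As a minimal sketch of this workflow (shown on lme4's built-in sleepstudy data rather than our guppy data): loading lmerTest masks lme4::lmer() so that summary() reports Satterthwaite denominator degrees of freedom and p-values.

```r
# Sketch only: demonstrates the lmerTest/Satterthwaite workflow on the
# built-in sleepstudy data, not on the guppy data analysed on this page.
library(lmerTest) # masks lme4::lmer() with a p-value-aware version

sketch_model <- lmer(Reaction ~ Days + (1 | Subject), data = lme4::sleepstudy)
summary(sketch_model) # coefficient table now includes df and Pr(>|t|)
```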

Model 1 - Preference for the green object at baseline

This first model contains the data for all individual guppies during the initial test. We looked at the green object preference of all guppies in an intercept-only model to test whether the baseline green object preference differed significantly from zero. green.object.preference is the time spent near the green object minus the time spent near the blue object.

baseline_data_model <-
  lm(green.object.preference ~ 1,
    data = baseline_data
  )
simulationOutput <- simulateResiduals(fittedModel = baseline_data_model)
plot(simulationOutput)

# Saving plot to figs directory
ggsave(
  filename = "exp-2-model-1-residual-plot.png",
  plot = (plot(simulationOutput)),
  path = "figs/exp-2/exp-2-residual-plots",
  device = "png",
  dpi = 300
)


Result
Factor Estimate Std. Error T statistic P value
Intercept 0.138 3.786 0.036 0.971

Just as in experiment 1, there is no significant preference for the green object over the blue object across all guppies during the initial test (p = 0.971).


Figure 1: Preference for the green object relative to the blue object across all guppies at baseline. Negative values represent more time spent with the blue object, positive values indicate more time spent with the green object. Data are means ± 95% CI


Model 2 - Preference for the rewarding object during training

Across all of training, fish spent on average 68.7% of trial time near an object: 56.3% of trial time near the rewarding object and 12.4% near the unrewarding object.
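These percentages are not computed on this page; a sketch of how they could be obtained from training_data (assumed, not the original code) uses the 300 s trial duration:

```r
# Assumed sketch: each trial lasts 300 s, so mean percentages of trial
# time are mean(time in seconds) / 300 * 100, taken over training trials.
training_data %>%
  summarise(
    pct.near.either.object = mean(time.with.both.objects) / 300 * 100,
    pct.near.rewarding     = mean(time.with.trained.object) / 300 * 100,
    pct.near.unrewarding   = mean(time.with.untrained.object) / 300 * 100
  )
```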

To see how fish behaved during training our second model asks whether the preference for the rewarding object changes throughout training and whether the change in rewarding object preference is different between the treatments.

  • Response variable: rewarding.object.preference is the time (seconds) spent near the rewarding object minus the time spent near the unrewarding object
  • Fixed effect: rewarding.object.colour is the identity of the rewarding object (blue or green)
  • Fixed effect: trial is the number of the training trial. In this model it is supplied as an integer
  • Random effect: id is the identity of the individual fish
training_data_model <-
  lmer(rewarding.object.preference ~ trial * rewarding.object.colour + (1 | id),
    data = training_data
  )
# Residual diagnostics
simulationOutput <- simulateResiduals(
  fittedModel = training_data_model,
  n = 1000
)
plot(simulationOutput)

# Saving plot to figs directory
ggsave(
  filename = "model-2-residual-plot.png",
  plot = (plot(simulationOutput)),
  path = "figs/exp-2/exp-2-residual-plots",
  device = "png",
  dpi = 300
)

There is a slight deviation in the lower quantile but no indication in the residual plot of a gross model misfit.


Results
Factor Estimate Std. Error T statistic df P value
Intercept 59.925 26.010 2.304 28.02 0.029
Trial 5.497 1.348 4.079 302.00 < .001
Rewarding object colour -33.516 36.784 -0.911 28.02 0.37
Rewarding object colour X Trial 5.802 1.906 3.044 302.00 0.003

There was a significant interaction effect between trial and rewarding object colour (p = 0.003) indicating that the change in rewarding object preference has a different trend depending on the rewarding object colour. We used the emtrends() function from emmeans to estimate and compare the trends.

training_data_model_trends <-
  emtrends(training_data_model,
    pairwise ~ rewarding.object.colour,
    var = "trial"
  )
Rewarding object colour Trial trend Std. Error df Lower CL Upper CL
blue 5.497 1.348 302 2.845 8.149
green 11.298 1.348 302 8.646 13.951

Guppies that were trained to green objects increased their relative preference for rewarding objects by 11.3 seconds per trial on average, whereas guppies trained to blue objects increased theirs by 5.5 seconds per trial on average. Thus, while both groups increased their preference for their respective rewarding objects over training, green-trained guppies did so at 2.1 times the rate of blue-trained guppies (Figure 2).


Figure 2: Relative preference for the green object in both treatments during training trials (trials 1-20). Negative values represent more time spent with the blue object, positive values indicate more time spent with the green object. Light lines connect individuals across trials. Subjects were consistently rewarded for approaching the blue object (dashed lines) or the green object (solid lines).


Model 3 - Preference for the rewarded object during testing depending on treatment

We used a generalised linear mixed effects model, fit with glmmTMB, which allowed us to model the variance associated with trial, since variances are heterogeneous between the initial test and the final test.

test_data_model_glm <-  
  glmmTMB(rewarding.object.preference ~  
            trial * rewarding.object.colour +
            diag(0 + rewarding.object.colour:trial |id), 
  data = test_data %>% filter(trial == "0" | trial == "21"), 
  family = gaussian
  )
simulationOutput <- simulateResiduals(fittedModel = test_data_model_glm, n = 1000)
plot(simulationOutput)

# Saving plot to figs directory
ggsave(
  filename = "exp-2-model-3-residual-plot.png",
  plot = (plot(simulationOutput)),
  path = "figs/exp-2/exp-2-residual-plots/",
  device = "png",
  dpi = 300
)


Results
Table 1: Summary table for Model 3. Estimates ± standard error (SE) of the effects of trial and rewarding object colour on the rewarding object preference from the generalized linear mixed effect model containing the effects Trial, Rewarding object colour, and their interaction effect (Trial X Rewarding object colour).
Factor Estimate Std. Error T statistic P value
Intercept -5.605 4.135 -1.355 0.175
Trial 33.483 17.905 1.870 0.061
Rewarding object colour 0.275 6.802 0.040 0.968
Rewarding object colour X Trial 14.014 21.721 0.645 0.519

Figure 3: Changes in object preference from an initial test before training to a final test after training. During training, fish were rewarded for approaching the blue object (blue squares and lines) or the green object (green squares and lines). At test, no food reward was present. Dashed line represents an equal preference for either object. Data are means ± 95% CI; lighter points and lines are data for each individual.

test_data_model_emmeans <- emmeans(test_data_model_glm,
  specs = ~ trial:rewarding.object.colour
)
# Making vectors to represent means of interest from emmeans() output
blue0 <- c(1, 0, 0, 0)
blue21 <- c(0, 1, 0, 0)
green0 <- c(0, 0, 1, 0)
green21 <- c(0, 0, 0, 1)
# Set seed to prevent confidence intervals from changing when code is re-run
set.seed(123)
custom_contrasts <- contrast(test_data_model_emmeans,
  method = list(
    "21 blue - 0 blue" = blue21 - blue0,
    "21 green - 0 green " = green21 - green0,
    "21 green - 21 blue" = green21 - blue21,
    "0 green - 0 blue" = green0 - blue0
  ), adjust = "mvt"
) %>%
  summary(infer = TRUE)
Post-Hoc Results
Table 2: Table of post-hoc tests with a multivariate-t adjustment for multiple comparisons of a selected set of means. The numbers represent the initial test (0) and the final test (21). The colour corresponds to the identity of the rewarding object (blue for blue-rewarded guppies, green for green-rewarded guppies). Values are all rounded to 3 decimal places. CL = confidence limit.
Contrast Estimate Lower CL Upper CL df P Value
21 blue - 0 blue 33.483 -13.489 80.454 23 0.217
21 green - 0 green 47.497 15.235 79.758 23 0.003
21 green - 21 blue 14.289 -39.828 68.406 23 0.872
0 green - 0 blue 0.275 -17.570 18.121 23 1

Model 4 - Is there a difference in feeding attempts between treatments?

A discrepancy in reinforcement between treatments may influence performance on a final preference test. To see whether there was a difference in feeding between treatments, we counted the number of trials in which each individual fish ate across all of training and compared these feeding counts between treatments. To do this we fit a generalized linear model with a negative binomial distribution, with the feeding count as the response variable.

  • Response variable: feeding.count is the number of trials in which an individual fish ate
  • Fixed effect: rewarding.object.colour is the identity of the rewarding object (blue or green)
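The feeding_data object used below is not constructed on this page; a plausible sketch (assumed — in particular, the factor level "yes" for ate is an assumption) is:

```r
# Assumed sketch: one row per fish, counting the training trials in which
# it ate. The `ate` level "yes" is assumed here for illustration.
feeding_data <- training_data %>%
  group_by(id, rewarding.object.colour) %>%
  summarise(feeding.count = sum(ate == "yes"), .groups = "drop")
```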
feeding_data_model <-
  glm.nb(feeding.count ~ rewarding.object.colour,
    data = feeding_data 
  )
simulationOutput <- simulateResiduals(fittedModel = feeding_data_model)
plot(simulationOutput)

# Saving plot to figs directory
ggsave(
  filename = "exp-2-model-4-residual-plot.png",
  plot = (plot(simulationOutput)),
  path = "figs/exp-2/exp-2-residual-plots/",
  device = "png",
  dpi = 300
)


Results
Factor Estimate Std. Error T statistic P value
Intercept 2.612 0.176 14.834 < .001
Rewarding object colour 0.045 0.248 0.181 0.857

We found no significant difference in the number of trials individuals fed between green-rewarded and blue-rewarded fish (Figure 4, p = 0.857). Removing the one individual that did not feed during training does not change this result (p = 0.577).


Figure 4: Average number of trials in which a fish fed during training. Data are means ± 95% confidence intervals with probability density functions of the data to the right of the raw data.

Model 5 - Does the learned colour preference generalize?

generalization_data_model <-
  glmmTMB(rewarding.object.preference ~
            trial * rewarding.object.colour + (1 | id) +
            diag(0 + trial | id),
  data = test_data %>%
    filter(trial %in% c("0", "21", "23", "25", "27", "29")) %>%
    filter(id != "8b"),
  family = gaussian
  )
emmeans(generalization_data_model,
        specs = trt.vs.ctrl ~ rewarding.object.colour:trial,
        by = "rewarding.object.colour")
## $emmeans
## rewarding.object.colour = blue:
##  trial emmean    SE df lower.CL upper.CL
##  0      -8.67  4.82 70   -18.28    0.941
##  21     38.76 14.00 70    10.84   66.682
##  23     20.19 17.05 70   -13.82   54.198
##  25      5.40 20.60 70   -35.68   46.482
##  27      8.00 20.41 70   -32.71   48.701
##  29     19.65 10.79 70    -1.88   41.174
## 
## rewarding.object.colour = green:
##  trial emmean    SE df lower.CL upper.CL
##  0      -5.33  4.51 70   -14.32    3.659
##  21     42.18 13.09 70    16.06   68.293
##  23      8.22 15.95 70   -23.59   40.034
##  25     68.25 19.27 70    29.82  106.677
##  27     44.95 19.09 70     6.87   83.023
##  29     76.47 10.10 70    56.34   96.609
## 
## Confidence level used: 0.95 
## 
## $contrasts
## rewarding.object.colour = blue:
##  contrast estimate   SE df t.ratio p.value
##  21 - 0       47.4 14.8 70 3.204   0.0093 
##  23 - 0       28.9 17.7 70 1.628   0.3528 
##  25 - 0       14.1 21.2 70 0.665   0.9004 
##  27 - 0       16.7 21.0 70 0.795   0.8456 
##  29 - 0       28.3 11.8 70 2.396   0.0790 
## 
## rewarding.object.colour = green:
##  contrast estimate   SE df t.ratio p.value
##  21 - 0       47.5 13.8 70 3.430   0.0047 
##  23 - 0       13.5 16.6 70 0.817   0.8347 
##  25 - 0       73.6 19.8 70 3.718   0.0019 
##  27 - 0       50.3 19.6 70 2.563   0.0529 
##  29 - 0       81.8 11.1 70 7.399   <.0001 
## 
## P value adjustment: dunnettx method for 5 tests
simulationOutput <- simulateResiduals(fittedModel = generalization_data_model)
plot(simulationOutput)

# Saving plot to figs directory
ggsave(
  filename = "exp-2-model-5-residual-plot.png",
  plot = (plot(simulationOutput)),
  path = "figs/exp-2/exp-2-residual-plots/",
  device = "png",
  dpi = 300
)


ESM Models

In this section we describe models not included in the main text.

Addressing feeding count

The models described in this section were run to address the potential role of feeding count on test performance. The concern was that the results might be explained by differential levels of reinforcement between the groups, or between individuals, during training. To ensure our results were robust to variation in feeding count we:

  1. Removed an individual that did not feed and re-ran the analysis in Model 3
  2. Included feeding count as a co-variate and re-ran the analysis in Model 3

ESM Model 1 - Removing individual that did not feed

One fish did not eat during any trial; however, removing this individual does not change the conclusions.

test_feeding_data_low_feeders_removed <-
  test_feeding_data %>% filter(feeding.count > 0)
Model
test_feeding_data_low_feeders_removed_model <-
  glmmTMB(rewarding.object.preference ~
  trial * rewarding.object.colour +
    diag(0 + rewarding.object.colour:trial | id),
  data = test_feeding_data_low_feeders_removed,
  family = gaussian
  )
## Warning in fitTMB(TMBStruc): Model convergence problem; non-positive-definite
## Hessian matrix. See vignette('troubleshooting')
simulationOutput <- simulateResiduals(fittedModel = test_feeding_data_low_feeders_removed_model)
plot(simulationOutput)

# Saving plot to figs directory
ggsave(
  filename = "exp-2-ESM-model-1-residual-plot.png",
  plot = (plot(simulationOutput)),
  path = "figs/exp-2/exp-2-residual-plots/",
  device = "png",
  dpi = 300
)


Results
Factor Estimate Std. Error T statistic P value
Intercept -8.654 13.248 -0.653 0.514
Trial 47.423 20.881 2.271 0.023
Rewarding object colour 28.836 19.551 1.475 0.14
Rewarding object colour X Trial 14.032 20.824 0.674 0.5

ESM Model 2 - Including feeding count as a covariate

With this next model we looked to see whether including the number of trials in which an individual fed as a covariate changes the conclusions.

  • Response variable: rewarding.object.preference is the time (seconds) spent near the rewarding object subtracted by the time spent near the unrewarding object
  • Fixed effect: rewarding.object.colour is the identity of the rewarding object (blue or green)
  • Fixed effect: trial is the number of the training trial. In this model it is supplied as an integer
  • Random effect: id is the identity of the individual fish
  • Covariate: feeding.count is the number of trials in which an individual fish ate
test_data_feeding_controlled_model <-
  glmmTMB(rewarding.object.preference ~
  trial * rewarding.object.colour + feeding.count + (1 | id) +
    diag(0 + rewarding.object.colour * trial | id),
  data = test_feeding_data,
  family = gaussian
  )
## Warning in fitTMB(TMBStruc): Model convergence problem; non-positive-definite
## Hessian matrix. See vignette('troubleshooting')
simulationOutput <- simulateResiduals(fittedModel = test_data_feeding_controlled_model)
plot(simulationOutput)

# Saving plot to figs directory
ggsave(
  filename = "exp-2-ESM-model-2-residual-plot.png",
  plot = (plot(simulationOutput)),
  path = "figs/exp-2/exp-2-residual-plots/",
  device = "png",
  dpi = 300
)


Results
Table 3: Summary table for a modification of Model 3. This model is the same as that described in Table S1 except that it includes feeding count as a covariate. Estimates ± standard error (SE) of the effects of trial, rewarding object colour, and feeding count on the rewarding object preference from the generalized linear mixed effect model containing the effects Trial, Rewarding object colour, Feeding count, and the interaction effect Trial X Rewarding object colour.
Factor Estimate Std. Error T statistic P value
Intercept -25.524 16.623 -1.535 0.125
Trial 33.482 18.724 1.788 0.074
Rewarding object colour 24.973 18.725 1.334 0.182
Feeding count -0.952 18.724 -0.051 0.959
Rewarding object colour X Trial 8.633 18.724 0.461 0.645

The main results do not change when we control for feeding count. The table above is the output of the feeding-count controlled model; below is the output table for the non-feeding-count controlled model from Model 3.

Factor Estimate Std. Error T statistic P value
Intercept -5.605 4.135 -1.355 0.175
Trial 33.483 17.905 1.870 0.061
Rewarding object colour 0.275 6.802 0.040 0.968
Rewarding object colour X Trial 14.014 21.721 0.645 0.519

In both models the p-values are similar. While the effect of feeding count is not significant (p = 0.959), it trends in the expected direction in our feeding-count controlled model: as feeding count increases, the preference for the rewarding object colour also increases.


Preference as a percentage

A reviewer of our manuscript asked:

Why the authors did not (also) exploit a percentage preference, which is commonly used?

To determine whether our results are robust to this choice of measure, we also ran an analysis with the percentage preference as ESM Model 3. The results from the main experiment remain unchanged when we do this.

ESM Model 3 - Percentage preference for the rewarded object during testing depending on treatment

Since our data are continuous proportions we used a generalized linear mixed effect model with a beta distribution.

prop_test_data_model_glm <-  
  glmmTMB(prop.rewarding.object.preference ~  
            trial * rewarding.object.colour + (1|id), 
  data = test_data %>% filter(trial == "21" | trial == "0"), 
  family = beta_family(link="logit")
  )
simulationOutput <- simulateResiduals(
  fittedModel = prop_test_data_model_glm,
  n = 1000
)
plot(simulationOutput)

# Saving plot to figs directory
ggsave(
  filename = "exp-2-esm-model-3-residual-plot.png",
  plot = (plot(simulationOutput)),
  path = "figs/exp-2/exp-2-residual-plots/",
  device = "png",
  dpi = 300
)


Results
Table 4: Summary table for ESM Model 3. Estimates ± standard error (SE) of the effects of trial and rewarding object colour on the rewarding object preference from the generalized linear mixed effect model containing the effects Trial, Rewarding object colour, and their interaction effect (Trial X Rewarding object colour).
Factor Estimate Std. Error T statistic P value
Intercept -0.142 0.178 -0.800 0.424
Trial 0.401 0.252 1.589 0.112
Rewarding object colour 0.024 0.252 0.095 0.924
Rewarding object colour X Trial 0.276 0.359 0.769 0.442

Figure 5: Changes in proportional object preference from an initial test before training to a final test after training. During training, fish were rewarded for approaching the blue object (blue squares and lines) or the green object (green squares and lines). At test, no food reward was present. Dashed line represents an equal preference for either object. Data are means ± 95% CI; lighter points and lines are data for each individual.

Post-Hoc Comparisons
prop_test_data_model_emmeans <- emmeans(prop_test_data_model_glm,
  specs = ~ trial:rewarding.object.colour, type = "response"
)
# Set seed to prevent confidence intervals from changing when code is re-run
set.seed(123)
prop_custom_contrasts <- contrast(prop_test_data_model_emmeans,
  method = list(
    "21 blue - 0 blue" = blue21 - blue0,
    "21 green - 0 green " = green21 - green0,
    "21 green - 21 blue" = green21 - blue21,
    "0 green - 0 blue" = green0 - blue0
  ), adjust = "mvt"
) %>%
  summary(infer = TRUE)
Results
Table 5: Table of post-hoc tests with a multivariate-t adjustment for multiple comparisons of a selected set of means. The numbers represent the initial test (0) and the final test (21). The colour corresponds to the identity of the rewarding object (blue for blue-rewarded guppies, green for green-rewarded guppies). Values are all rounded to 3 decimal places. CL = confidence limit.
Contrast Odds ratio Lower CL Upper CL df P Value
21 blue / 0 blue 1.494 0.772 2.890 26 0.344
21 green / 0 green 1.968 1.008 3.843 26 0.046
21 green / 21 blue 1.350 0.690 2.639 26 0.593
0 green / 0 blue 1.024 0.530 1.978 26 1

Miscellaneous descriptive statistics

After trial 21 all guppies were weighed. Weights ranged from 0.10 to 0.31 grams, with a mean of 0.17 grams. Blue-trained guppies weighed more on average than green-trained guppies (0.19 grams vs 0.14 grams), but this difference was not statistically significant.

Factor Estimate Std. Error T statistic P value
(Intercept) 0.189 0.022 8.770 < 0.001
rewarding.object.colourgreen -0.045 0.030 -1.479 0.161
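The coefficient names above are consistent with a simple linear model of weight on training colour. A minimal sketch under assumed column names (`weight` in grams and `rewarding.object.colour` with levels blue/green); simulated data are used here so the snippet is self-contained:

```r
# Sketch of the weight comparison, assuming columns `weight` and
# `rewarding.object.colour`. Data are simulated for illustration only.
set.seed(1)
guppy_weights <- data.frame(
  weight = c(rnorm(14, mean = 0.19, sd = 0.05),
             rnorm(14, mean = 0.14, sd = 0.05)),
  rewarding.object.colour = rep(c("blue", "green"), each = 14)
)

# With treatment contrasts, the intercept is the blue-trained mean and
# the `rewarding.object.colourgreen` coefficient is the green - blue
# difference, matching the table layout above.
weight_model <- lm(weight ~ rewarding.object.colour, data = guppy_weights)
summary(weight_model)$coefficients
```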

Packages used

The analyses on this page were done with R version 3.6.2 (2019-12-12) and with functions from the packages listed in Table 6. This page was written in R Markdown and rendered with knitr. For the full list of dependencies for all packages used, as well as their versions, please visit the How to Reproduce the Results section of the README.

Table 6: All packages used for this analysis. The dependencies for these packages as well as their versions can be found in the README file.
Package Version Reference
broom 0.5.5 David Robinson and Alex Hayes (2020). broom: Convert Statistical Analysis Objects into Tidy Tibbles. R package version 0.5.5. https://CRAN.R-project.org/package=broom
broom.mixed 0.2.6 Ben Bolker and David Robinson (2020). broom.mixed: Tidying Methods for Mixed Models. R package version 0.2.6. https://CRAN.R-project.org/package=broom.mixed
carData 3.0.3 John Fox, Sanford Weisberg and Brad Price (2019). carData: Companion to Applied Regression Data Sets. R package version 3.0-3. https://CRAN.R-project.org/package=carData
cowplot 1.0.0 Claus O. Wilke (2019). cowplot: Streamlined Plot Theme and Plot Annotations for ‘ggplot2.’ R package version 1.0.0. https://CRAN.R-project.org/package=cowplot
DHARMa 0.3.3.0 Florian Hartig (2020). DHARMa: Residual Diagnostics for Hierarchical (Multi-Level / Mixed) Regression Models. R package version 0.3.3.0. http://florianhartig.github.io/DHARMa/
dplyr 1.0.3 Hadley Wickham, Romain François, Lionel Henry and Kirill Müller (2021). dplyr: A Grammar of Data Manipulation. R package version 1.0.3. https://CRAN.R-project.org/package=dplyr
effects 4.1.4 John Fox and Sanford Weisberg (2019). An R Companion to Applied Regression, 3rd Edition. Thousand Oaks, CA http://tinyurl.com/carbook
emmeans 1.5.1 Russell Lenth (2020). emmeans: Estimated Marginal Means, aka Least-Squares Means. R package version 1.5.1. https://CRAN.R-project.org/package=emmeans
ggplot2 3.3.3 H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
ggpubr 0.2.5 Alboukadel Kassambara (2020). ggpubr: ‘ggplot2’ Based Publication Ready Plots. R package version 0.2.5. https://CRAN.R-project.org/package=ggpubr
glmmTMB 1.0.0 Mollie E. Brooks, Kasper Kristensen, Koen J. van Benthem, Arni Magnusson, Casper W. Berg, Anders Nielsen, Hans J. Skaug, Martin Maechler and Benjamin M. Bolker (2017). glmmTMB Balances Speed and Flexibility Among Packages for Zero-inflated Generalized Linear Mixed Modeling. The R Journal, 9(2), 378-400.
googledrive 1.0.1 Lucy D’Agostino McGowan and Jennifer Bryan (2020). googledrive: An Interface to Google Drive. R package version 1.0.1. https://CRAN.R-project.org/package=googledrive
knitr 1.30 Yihui Xie (2020). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.30.
lme4 1.1.21 Douglas Bates, Martin Maechler, Ben Bolker, Steve Walker (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01.
lmerTest 3.1.1 Kuznetsova A, Brockhoff PB, Christensen RHB (2017). “lmerTest Package: Tests in Linear Mixed Effects Models.” Journal of Statistical Software, 82(13), 1-26. doi: 10.18637/jss.v082.i13 (URL: https://doi.org/10.18637/jss.v082.i13).
magrittr 2.0.1 Stefan Milton Bache and Hadley Wickham (2020). magrittr: A Forward-Pipe Operator for R. R package version 2.0.1. https://CRAN.R-project.org/package=magrittr
MASS 7.3-51.4 Venables, W. N. & Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition. Springer, New York. ISBN 0-387-95457-0
Matrix 1.2.18 Douglas Bates and Martin Maechler (2019). Matrix: Sparse and Dense Matrix Classes and Methods. R package version 1.2-18. https://CRAN.R-project.org/package=Matrix
R 3.6.2 R Core Team (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
report 0.2.0 Makowski, D., Ben-Shachar, M.S., Patil, I. & Lüdecke, D. (2020). Automated reporting as a practical tool to improve reproducibility and methodological best practices adoption. CRAN. Available from https://github.com/easystats/report.
rmarkdown 2.6.4 JJ Allaire and Yihui Xie and Jonathan McPherson and Javier Luraschi and Kevin Ushey and Aron Atkins and Hadley Wickham and Joe Cheng and Winston Chang and Richard Iannone (2021). rmarkdown: Dynamic Documents for R. R package version 2.6.4. URL https://rmarkdown.rstudio.com.
stringr 1.4.0 Hadley Wickham (2019). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.4.0. https://CRAN.R-project.org/package=stringr
tidyr 1.0.2 Hadley Wickham and Lionel Henry (2020). tidyr: Tidy Messy Data. R package version 1.0.2. https://CRAN.R-project.org/package=tidyr

References

Kuznetsova A, Brockhoff PB, Christensen RHB. 2017. lmerTest Package: Tests in Linear Mixed Effects Models. Journal of Statistical Software 82:1–26. doi:10.18637/jss.v082.i13
Noldus LPJJ, Spink AJ, Tegelenbosch RAJ. 2001. EthoVision: A versatile video tracking system for automation of behavioral experiments. Behavior Research Methods, Instruments, & Computers 33:398–414. doi:10.3758/BF03195394